"For Sweden" (rallybeetle)
11/14/2018 at 18:36 • Filed to: NOT A CAR, Oppo Questions | 1 | 27 |
I’m starting a data collection and analysis project, and this is one hour of collected data in CSV form: 277.3 MB. I’m going to need more space.
I’ll put this in SQL or MongoDB eventually, and I should have access to an off-site server room. Is there any reason not to get a 2U or 4U hot swap drive server for network storage?
A big server: 4U
Highlander-Datsuns are Forever
> For Sweden
11/14/2018 at 18:44 | 1 |
Do you speak French?
BJ
> Highlander-Datsuns are Forever
11/14/2018 at 18:46 | 0 |
Yes, why? :)
For Sweden
> Highlander-Datsuns are Forever
11/14/2018 at 18:48 | 3 |
Yes
I’m starting a data collection and analysis project. This is one hour of collected data in CSV form: 277.3 MB. I’m going to need more space.
I’ll eventually put this in SQL or MongoDB, and I should have access to an off-site server room. Is there any reason not to use a 2U or 4U hot-swap drive server for network storage?
Just Jeepin'
> For Sweden
11/14/2018 at 18:49 | 0 |
Not Mongo if you value your data. Stick with Postgres.
BJ
> For Sweden
11/14/2018 at 18:50 | 0 |
First question: Why CSV?
Second question: what are you planning on storing in Mongo? CSV, or proper objects? If the data can be denormalized, SQL is a good option, too. But choose your database wisely...
And to answer your question: I dunno. What’s the value of the data you’re collecting right now? If it’s disposable, replaceable, or just test, get a cheap home NAS with 3TB or something. That gets you roughly 2-3 weeks of data until you figure this out. You can keep that rolling window of data going until something bigger comes in.
Are you thinking of going with Google or Amazon cloud storage eventually, or do you want/need to keep the data in-house?
For Sweden
> BJ
11/14/2018 at 18:53 | 0 |
CSV because I needed to make sure I could dump the data into something. I’ll start migrating it to a real database soon. It’s a timestamp, two identifying variables, and a hex string.
It’s not priceless, but I’d rather not lose it; especially with access to the off-site server room.
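For illustration only, a minimal sketch of what that migration could look like with SQLite from the Python standard library. The table and column names (`frames`, `ts`, `id_a`, `id_b`, `payload`) and the sample rows are hypothetical; the post only says each record is a timestamp, two identifying variables, and a hex string.

```python
import csv
import io
import sqlite3

# Hypothetical schema: the column names are placeholders, not the real ones.
SCHEMA = """
CREATE TABLE IF NOT EXISTS frames (
    ts      TEXT NOT NULL,
    id_a    TEXT NOT NULL,
    id_b    TEXT NOT NULL,
    payload TEXT NOT NULL
)
"""

def load_csv(conn, lines):
    """Insert CSV rows (ts, id_a, id_b, payload); return how many were stored."""
    conn.execute(SCHEMA)
    # Skip rows that do not have exactly four fields.
    rows = [tuple(r) for r in csv.reader(lines) if len(r) == 4]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO frames VALUES (?, ?, ?, ?)", rows)
    return len(rows)

# Made-up sample data standing in for one of the hourly CSV dumps.
sample = io.StringIO(
    "2018-11-14T18:00:00Z,rx1,ch0,8D4840D6202CC371C32CE0576098\n"
    "2018-11-14T18:00:01Z,rx1,ch0,02E99619FACDAE\n"
)
conn = sqlite3.connect(":memory:")
inserted = load_csv(conn, sample)
print(inserted, "rows loaded")
```

The same four-column layout would carry over to Postgres or another SQL database with only the connection code changing.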
BJ
> For Sweden
11/14/2018 at 18:56 | 0 |
Time-series data, then? MongoDB would be a better fit than SQL, then. I’m guessing the hex string is a hash of some sort? I’ve only just started playing with MongoDB, and it’s powerful. But updating objects, or worse sub-objects, is no fun unless you have a nice data library like Spring Data.
For Sweden
> BJ
11/14/2018 at 19:00 | 0 |
To be extremely technical, it’s ADS-B data in AVR format
http://wiki.modesbeast.com/Mode-S_Beast:Data_Output_Formats
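A rough sketch of pulling such a frame apart, assuming the AVR text framing as I read it from the page linked above: `*` plus hex plus `;` for a plain frame, and `@` plus 12 hex digits of timestamp plus hex plus `;` for a timestamped one. The function names are made up for this example, and the framing should be checked against the wiki before relying on it.

```python
def parse_avr(line):
    """Split one AVR text frame into (timestamp_ticks_or_None, payload_hex)."""
    line = line.strip()
    if len(line) < 3 or not line.endswith(";"):
        raise ValueError("not an AVR frame: %r" % line)
    marker, body = line[0], line[1:-1]
    if marker == "*":        # plain frame, no timestamp
        ts, payload = None, body
    elif marker == "@":      # first 6 bytes (12 hex digits) are the timestamp
        ts, payload = int(body[:12], 16), body[12:]
    else:
        raise ValueError("unknown frame marker: %r" % marker)
    bytes.fromhex(payload)   # raises ValueError if the payload is not valid hex
    return ts, payload.upper()

def downlink_format(payload_hex):
    """The Mode S downlink format is the top 5 bits of the first byte."""
    return int(payload_hex[:2], 16) >> 3

def icao_address(payload_hex):
    """For DF17 (extended squitter), bytes 1-3 carry the ICAO address."""
    return payload_hex[2:8]

ts, payload = parse_avr("*8D4840D6202CC371C32CE0576098;")
print(ts, downlink_format(payload), icao_address(payload))
```

Storing the decoded downlink format and address alongside the raw hex string would make the data far easier to query later than the hex string alone.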
BJ
> For Sweden
11/14/2018 at 19:02 | 0 |
Wait, that was one HOUR of data? Fuck! I thought it was a day... At least roll over to a new file every hour and zip your files to save space until you have a final solution.
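One way to do the hourly rolling and compression in a single step is to write straight into gzip files, sketched below with only the Python standard library. The class name, the `adsb-YYYYMMDD-HH.csv.gz` naming scheme, and the demo rows are all invented for this example.

```python
import gzip
import os
import tempfile
from datetime import datetime, timezone

class HourlyGzipWriter:
    """Append text lines to one gzip file per clock hour (hypothetical naming)."""

    def __init__(self, directory):
        self.directory = directory
        self._hour = None
        self._file = None

    def write(self, when, line):
        hour = when.strftime("%Y%m%d-%H")
        if hour != self._hour:
            # The hour changed (or nothing is open yet): roll to a new file.
            self.close()
            path = os.path.join(self.directory, "adsb-%s.csv.gz" % hour)
            self._file = gzip.open(path, "at", encoding="ascii", newline="")
            self._hour = hour
        self._file.write(line + "\n")

    def close(self):
        if self._file is not None:
            self._file.close()
        self._file = None
        self._hour = None

# Demo: two rows on either side of an hour boundary land in two files.
demo_dir = tempfile.mkdtemp()
writer = HourlyGzipWriter(demo_dir)
writer.write(datetime(2018, 11, 14, 18, 59, tzinfo=timezone.utc), "row-a")
writer.write(datetime(2018, 11, 14, 19, 0, tzinfo=timezone.utc), "row-b")
writer.close()
files = sorted(os.listdir(demo_dir))
print(files)
```

Keeping the file open for the whole hour, rather than reopening it per row, lets gzip compress across rows, which matters for repetitive data like this.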
For Sweden
> BJ
11/14/2018 at 19:06 | 1 |
But I haven’t purchased a WinRAR license!
Highlander-Datsuns are Forever
> For Sweden
11/14/2018 at 19:07 | 1 |
Oh shit.
Spamfeller Loves Nazi Clicks
> For Sweden
11/14/2018 at 19:09 | 4 |
Oh. My. Gods.
Are you trying to hurt me? Hi. I’m RootWyrm. I know more about storage than.. well.. I get to lecture NetApp employees on how to do NFSv4 + KRB5 + AD so yeah AND IBM people as to why they are wrong about how SVC StorWize works.
And quite frankly, ARE YOU OUT OF YOUR FLIPPING MIND? What, you want to spend the next six months of your life chasing phantom SAS disconnects and SATA resets and bitching about how godawful the performance is all the time? Because seriously. That’s what is going to happen. That is EXACTLY what is going to happen because that is what ALWAYS happens.
Look, you don’t have 25 years of systems integration under your belt. And that’s the shit it takes me weeks to fight through with nearly 30 years of experience. The ONLY time you build your own storage “solution” is when the only thing you are concerned with is building your own storage “solution,” and not even then. Seriously. I hate dealing with it for a reason. And I have manufacturers that custom make backplanes, midplanes, and cables.
Not only that, but 277MB/hr of CSV going into any sort of OLTP or database? Yeaaaah that ain’t small demand, man. That’s 6.6TB/day multiplied by your retention equals You Are Not Doing This With SATA. Especially not with RAID write amplification.
If it’s PURELY cold store? Then sure. That’s fine. Just buy a Synology. Specifically I would recommend the RS4017xs+ or the RS3618xs, both of which can be expanded with the RX1217.
If this is going to be live storage? Uh, yeah, you do not want to know what it costs.
For Sweden
> Spamfeller Loves Nazi Clicks
11/14/2018 at 19:10 | 1 |
This is the enthusiast answer I was looking for
Spamfeller Loves Nazi Clicks
> For Sweden
11/14/2018 at 19:13 | 0 |
Well, except I get to legitimately claim to be a pro. So, you know, I totally will sell you a solution that can work this data live and in full real time. And it will do what it says on the tin better than anything you can buy from a Tier 1 vendor.
It will also cost you at least $95,000USD before any additional integration services.
Call me?
XJDano
> For Sweden
11/14/2018 at 19:15 | 1 |
Mobile storage shed.
https://stlouis.craigslist.org/tro/d/shed-trailer/6726814271.html
For Sweden
> Spamfeller Loves Nazi Clicks
11/14/2018 at 19:16 | 1 |
I would, but it’s closer to 6 GB/day, and I .zip’d it down to 2.4 GB/day. I have a few years before I need to build a cooling tower for the data center.
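The scale is quick to sanity-check. A back-of-the-envelope calculation using only the two figures quoted in the thread (277.3 MB per hour raw, 2.4 GB per day zipped) and assuming decimal units and a constant data rate:

```python
MB_PER_HOUR = 277.3        # raw CSV rate from the original post
ZIPPED_GB_PER_DAY = 2.4    # compressed rate quoted in the thread

# Decimal units assumed: 1 GB = 1000 MB, 1 TB = 1000 GB.
raw_gb_per_day = MB_PER_HOUR * 24 / 1000
raw_tb_per_year = raw_gb_per_day * 365 / 1000
zipped_tb_per_year = ZIPPED_GB_PER_DAY * 365 / 1000
compression_ratio = raw_gb_per_day / ZIPPED_GB_PER_DAY

print("raw:    %.2f GB/day, %.2f TB/year" % (raw_gb_per_day, raw_tb_per_year))
print("zipped: %.2f GB/day, %.2f TB/year" % (ZIPPED_GB_PER_DAY, zipped_tb_per_year))
print("compression ratio: %.1fx" % compression_ratio)
```

That works out to roughly 6.7 GB/day and about 2.4 TB/year raw, or under 1 TB/year zipped, so three orders of magnitude below 6.6 TB/day.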
Future Heap Owner
> For Sweden
11/14/2018 at 19:39 | 1 |
Yeah, I was just gonna say “zip it for now and don’t worry about it for a couple weeks”
Spamfeller Loves Nazi Clicks
> For Sweden
11/14/2018 at 19:49 | 0 |
Bah, you’re right, I mixed up scales. (I’m too used to working in units like ‘250GB/hr’ or ‘250GB/min.’)
So hey that drops it to like.. uh.. hang on.. 6 * 90 = ~0.6TB soooo depends on the data rate since your format is going to force full block for every event. (Ugh. Can’t people chunk data worth a shit any more?)
But yeah, I could fit 2 years into 2U with 180 days at real-time query performance. I mean hell, fixing the scale properly, you could also drop down to an RS1219+ for cold storage. Configure RS1219+ exactly like this though and use 7200RPM not “WD Red” or NAS drives. If it ain’t 7200RPM it’s shit.
Use 6 drives (10TB) or 8 drives (12TB) and SHR for single disk protection, SHR2 or RAID6 for 2 disk protection. Feel free to drop down to 500GB or 1TB, but do not increase past 4TB disks. SSD is also an option but I wouldn’t recommend it for the actual workload as opposed to the “RootWyrm used the wrong damn scale” workload.
For Sweden
> Spamfeller Loves Nazi Clicks
11/14/2018 at 19:56 | 0 |
I know people that use Teledyne quick access recorders and download every flight. I know that multi-TB-per-day workflow is way beyond my abilities.
BJ
> For Sweden
11/14/2018 at 20:02 | 1 |
Yeah, you're fucked. The license police are coming for your ass.
Spamfeller Loves Nazi Clicks
> For Sweden
11/14/2018 at 20:18 | 0 |
Oh, it’s well within your abilities. Just well outside of your budget. ;)
BJ
> Just Jeepin'
11/14/2018 at 20:24 | 0 |
Hmmm, but they're not really designed to solve the same problem. What have you been through with MongoDB? I'm playing with it right now.
Just Jeepin'
> BJ
11/14/2018 at 20:39 | 0 |
I haven’t, but historically they’ve made some very poor decisions in the name of performance, like confirming a write has succeeded before the traffic even leaves the client.
It’s better than it used to be, but in general it’s not a good sign when the joke about your database is that migrating away is trivial: just give it enough time and all your data will be gone anyway.
TheRealBicycleBuck
> For Sweden
11/14/2018 at 20:43 | 1 |
I’m looking at aircraft right now and ADSB In/Out functionality is important! There’s a relatively inexpensive solution that replaces the left side nav light and integrates with your current GPS and transponder, but I’d rather not have to spend another $2-3k to get a plane 2020 compliant.
BJ
> Just Jeepin'
11/14/2018 at 20:45 | 0 |
Interesting. My personal world is generally based on standard sql databases, so I don't have much experience. I'm building a personal project based on it, but using Spring Data should allow me to switch away if I need to quite easily.
For Sweden
> TheRealBicycleBuck
11/14/2018 at 21:48 | 1 |
But not crashing into other airplanes is good
TheRealBicycleBuck
> For Sweden
11/14/2018 at 22:11 | 0 |
Agreed. I’d rather get a plane with it already installed than buy something non-compliant and have to hire someone to do the work.